Introduction to acoustic analysis


This guide has a number of direct hyperlinks: I recommend that you right-click and open them in new browser tabs.

1 Prerequisites

Please follow the following instructions before the day of the practical. You can contact me at if you run into trouble.

1.1 Install Sonic Visualiser

We will be using this application to view and annotate the contents of audio files. Go to this website and download the installer for your platform (i.e., Windows, Mac, or a Linux distribution). Navigate to the downloaded file, execute it and follow any instructions. You can check that the installation has been successful by right-clicking on an audio file (e.g., .wav) and choosing ‘open with’, then ‘Sonic Visualiser’.

There are some demo videos here, including one on how to install this software on Mac, and this is the reference manual.

1.2 Install R dependencies

Double-click to open the bioacoustics-practical.Rproj file in RStudio. Now open the intro-vignette.Rmd file from RStudio and execute the code chunk below. You can do this by clicking the green ‘play’ button to the right of the chunk or by pressing ctrl/cmd+shift+enter. This will install the code libraries required for this practical should you not have them already. Depending on the speed of your internet connection this might take up to a few minutes.

dependencies = c(
    "warbleR",
    "dplyr",
    "magrittr",
    "stringr",
    "ggrepel"
  )
# Load or install packages 
packages = lapply(dependencies, function(y) {
  if (!y %in% installed.packages()[, "Package"])  {
    install.packages(y)
  }
  try(require(y, character.only = T), silent = T)
})

2 General introduction

2.1 Introducing the dataset

Navigate to data/audio-files in your project folder, bioacoustics-practical.

These audio files are sample vocalisations for 15 bird species across a wide body mass range — from the Goldcrest’s 5.8 grams of sheer adorableness to the much graver looking Rook, over 70 times larger.

Goldcrest picture (left) by Cliff Watkinson, Rook (right) by hedera.baltica. Both CC BY-SA 2.0

During this activity you will:

  • try to guess the species that produced each of these vocalisations,
  • extract and analyse basic acoustic information to test hypotheses involving sound, and
  • learn how to perform more complex audio feature extraction and analysis to classify vocalisations

2.2 Visualising sound

Right-click the first file and choose ‘open with’, then ‘Sonic Visualiser’. You might want to make this software your default option to open .wav files; this will save you having to do this every time you want to open a new file. (?)

Once the file is open you should see a waveform at the bottom of the window.

A waveform shows changes in amplitude over time

Press W on your keyboard (alternatively, click on the ‘Pane’ menu, then ‘Add Waveform’). You will see a second waveform, this time with greater temporal resolution.

  • Play the sound by pressing your spacebar.
  • Scroll to increase or decrease the time range.

This is a useful visualisation if we want to see how ‘loud’ vocalisations are at a particular point in time. But it does not give us any information about the spectral characteristics of the sound, that is, about how much vibration there is at each different frequency.

For this reason, researchers working with sound often use spectrograms, sometimes also called sonographs. You can take a minute to play around in this website to see how a spectrogram works. Notice how the y-axis represents a range of frequencies, the x-axis shows time, and colour encodes amplitude, or how ‘loud’ each point is.

Example of a spectrogram like the ones we will be making during the session

Now, back in Sonic Visualiser,

  • You can now remove the waveform pane (right-click > delete layer until there are none left) and add a new spectrogram pane by pressing G.
  • Change the Window parameter to 512 or 1024 and Window overlap, to its right, to 93.75%. Here is brief explanation of these parameters affect the resolution of the spectrogram, should you be interested. Play around with these - the different species in the dataset have been recorded at different sample rates, so the optimal parameters will change.
  • Use the zoom wheels to zoom in or out in the x and y axes.
  • Change the colour palette in the Colour tab if you want.
  • Use the first wheel to the right of the colour tab to play with the colour threshold; it is useful to adjust this until the background noise disappears (see gif below)
  • Once you are happy with how the spectrogram looks you can click on File > Export Session as Template, give it a name, and select the option to set it as default. This will spare you from having to do this every time you open a new file.

Spectrogram thresholding

3 Identifying some widespread UK birds

Navigate to ./data/audio-files in your project folder, bioacoustics-practical.

These are recordings of songs and vocalisations of 15 species of birds. These are selected as species that we should hear in and around Oxford in the next two weeks, but are just identified with file names 1.mp3 to 15.mp3.

  1. Your task is to work out the identity of the 15 species in the recordings & record these in your bird-ids.csv list, which can be found in the ./data folder. Some of you may want to have a go based on your existing knowledge, but even if so, please go to xeno-canto and search for the species and listen to listed recordings of song to double check.

  2. If you’re less sure, please open the rand-species-names.csv file which has the (randomised) species names in them. This will give you a list of species to match up with the recordings.

  3. Go to xeno-canto and enter a species names in the search box at the top – in this case you can see there are 113 hits for Great Tinamou and 6044 for Great Tit – it is the latter that we want!

  1. Entering “Great Tit” returns a new page with a map of the species range and the geographic location of the recording.

  2. As there may be geographical variation in songs, zoom in on the UK; you’ll note that the dark blue dots on the UK correspond to the subspecies ‘newtoni’ on the key to the right – this is usually accepted as a subspecies endemic to the UK for great tit. Don’t worry if there is no subspecies endemic to the UK.

  3. Click on the name of the UK subspecies (in this case newtoni) and it will filter the recordings by location to a UK list. Note that these are recordings of songs and calls, and we’re asking you to identify (mostly) songs, and that the recordings are scored on quality (in the “Actions” column from “A” – highest to “E” – lowest).

We’ve allowed c. 45 minutes for this. If any of you finish much ahead of this, please feel free to explore the functionality of xeno-canto. There are also many recordings of songs and calls on eBird: simply enter the species name that you want.

Please note:

  • The xeno-canto website has had performance problems of late - if it’s acting up you can alternatively use the Macaulay Library and eBird.
  • Make sure that you do not modify the layout of the bird-ids.csv file beyond entering your guesses in the ‘name’ column, and that you save it as a .csv when you are done.

If you have a lot of time and want quite a tough challenge, try one of the eBird audio quizzes here. We’ll use these species to carry out some comparative analysis of song properties in the next section.

4 Measuring sound

We are now going to explore a real hypothesis concerning the frequencies at which birds produce their vocalisations. In the process, you will learn to manually extract basic information from sound recordings.

4.1 The problem

We know that the size of the vibrating structure that produces a sound influences its frequency, so we might expect that A, body size, be correlated with B, the morphology of the vocal apparatus, which would, in turn, influence C, the frequency of a bird’s vocalisations. We cannot easily investigate B in the course of this activity, but we can test whether A and C are themselves correlated — which would provide some support for this idea.


Q1: If there is indeed a correlation between the size of a bird and the frequency of its vocalisations, of which sign would you expect it to be?

Q2: What other factors might lead to differences in frequency across different bird species


To test this hypothesis — that body size affects the frequency of vocalisations — you will need to extract some basic spectral information from these vocalisations:


4.2 Extracting time and frequency data from sound files

  • Click on Layer > Add New Boxes Layer in a Sonic Visualiser window (right click on the spectrogram or go to the top menu. You can now draw boxes over the elements of interest, and erase any box that you are not happy with.

Example of boxes bounding the discrete elements in a vocalisation

  • Adjust the colour threshold (as explained above) until only the brightest parts of the image are visible. This will help you focus on the frequencies with the highest amplitude, or peak frequencies, which is the song trait that we will analyse here.

This is the range where the peak frequencies are in this vocalisation

  • Draw boxes over a representative sample of the acoustic elements present in a recording. Do this for each discrete type of sound that you see, but you do not need to include more than one or two examples per type. Indeed, for swiftness’ sake, you shouldn’t. You do not need to segment more than ~ 10 elements per audio file: a small number will already capture important information.

  • Once you are done with a recording, press ctrl+Y/cmd+Y, or click on File > Export Annotation Layer. Give this file a name — specifically, the number of the file + .csv, for example, 1.csv — and save it in the /data/frequency-data/ folder within this project’s folder. Leave the option to include a header unticked.

  • You can now close the current file (no need to save the session) and open the next one.


While you work through the vocalisations, notice how some birds have simple, pure-tone vocalisations, while those of others have a multi-frequency harmonic spectrum, with a fundamental frequency and many harmonic overtones. Still other species combine these two types skilfully.

Q: Can you find examples of each of these in the dataset? What might be the mechanisms that produce these differences? Here is a paper about this if you want to read more.


4.3 Visualising the results using R

Now we will import the data that you have just generated using R, get the mean frequency for each species, and plot our variables of interest — mean frequency and body mass.

  • Open the R project in Rstudio. This is the bioacoustics-practical.Rproj file. Once you are in Rstudio, open the full-vignette.Rmd and navigate to this section. If you press Ctrl/Cmd+Shift+O while the Rstudio window is active you will get an outline of the document, which might help you find your way more easily.

  • First, the following code chunk will install the code libraries required for this practical, should you not have them already:

dependencies = c(
    "warbleR",
    "dplyr",
    "magrittr",
    "stringr",
    "ggrepel",
    "patchwork"
  )
# Load or install packages 
packages = lapply(dependencies, function(y) {
  if (!y %in% installed.packages()[, "Package"])  {
    install.packages(y)
  }
  try(require(y, character.only = T), silent = T)
})
  • Now we will import the data that you manually extracted from the audio files.
# Import the vocalisation frequency data
body_mass_df = read.csv(file.path(getwd(), "data", "bird-ids-private.csv"), header = TRUE)

freq_data_files = list.files(file.path(getwd(), "data", "frequency-data"))

freq_list = list()
for (file in freq_data_files) {
  df = read.csv(file.path(getwd(), "data", "frequency-data", file), header = FALSE)[3:4] # get the min/max ferquency data
  global_mean = mean(rowMeans(df, na.rm = TRUE)) # Get mean for each element and then global mean
  freq_list[gsub("\\.csv$", "", file)] = global_mean
}

freq_df = stack(freq_list) %>% # make a dataframe with the frequency and ID info
  rename(file = ind, mean_frequency = values)
  • The next step is to read the body mass data and your species guesses.
body_mass_df = read.csv(file.path(getwd(), "data", "body-mass-data.csv"), header = TRUE)

ids_df = read.csv(file.path(getwd(), "data", "bird-ids.csv"), header = TRUE) %>% 
  replace(is.na(.), "Unknown") %>% 
  mutate(name = str_replace(str_to_sentence(name), "_", " "), file = as.factor(file))

# Now put everything together in a single dataframe
full_df = cbind(ids_df, body_mass_df) %>% left_join(., freq_df) %>% 
  mutate(across(where(is.numeric), ~ round(., 2)))
  • You can take a look at the final dataset:
knitr::kable(full_df)
file name body_mass mean_frequency
1 Unknown 117.50000 2566.1198
2 Unknown 450.00000 489.8137
3 Unknown 9.00000 6475.6497
4 Unknown 453.50000 1778.9409
5 Unknown 18.00000 5177.9416
6 Unknown 20.25000 4499.4619
7 Unknown 18.70000 3791.2291
8 Unknown 10.35000 6787.0895
9 Unknown 10.27143 5856.3195
10 Unknown 300.00000 397.8008
11 Unknown 5.80000 6899.7876
12 Unknown 15.10000 5191.9779
13 Unknown 24.56667 3620.6775
14 Unknown 103.20000 1971.4696
15 Unknown 67.75000 3289.6369
  • Let’s now inspect the variables separately:
# Put some plot settings into a function so that we can reuse them:
bioac_theme = function() {
    theme(
    panel.background =  element_rect(fill = '#f5f5f5', colour = '#f5f5f5'),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    axis.ticks.y = element_blank(),
    axis.ticks.x = element_blank(),
    legend.position = "none",
    text = element_text(size = 15),
    plot.title = element_text(vjust = 0.1),
    
  )
}

# Plot each variable separately

bmass_plot = full_df %>% ggplot(aes(x=body_mass)) + 
  geom_density(fill='#bfbfbf', colour='#bfbfbf') + 
  geom_rug(length = unit(0.045, "npc")) + 
  theme(aspect.ratio = 1) +
  bioac_theme() +
  theme(plot.margin = unit(c(0,30,0,0), "pt")) +
  labs(
    x = '\nBody mass (g)',
    y = 'Density\n'
  )

freq_plot = full_df %>% ggplot(aes(x=mean_frequency)) + 
  geom_density(fill='#bfbfbf', colour='#bfbfbf') + 
  geom_rug(length = unit(0.045, "npc")) + 
  theme(aspect.ratio = 1) +
  bioac_theme()+
    labs(
    x = '\nMean frequency (Hz)',
    y = element_blank()
  )

bmass_plot + freq_plot + plot_annotation(
    title = 'Smoothed histograms (KDEs) of our two variables\n',
    caption = '\nKDE = Kernel Density Estimation, not the right method to use here but useful for vis. purposes',
    theme = theme(plot.title = element_text(size = 18, hjust = 0.5), plot.caption = element_text(size = 7, hjust = 0.5))
  ) 
Distribution of body mass (left) and mean vocalisarion frequency (right)

Figure 4.1: Distribution of body mass (left) and mean vocalisarion frequency (right)

4.4 Results

  • Finally, take a look at how the variables are related:
# Plot the data
full_df %>%
  mutate(name = str_replace(str_to_sentence(name), "_", " ")) %>%  # Remove underscore and capitalise species names
  ggplot(aes(x = body_mass, y = mean_frequency)) +
  geom_smooth(
    method = 'lm',
    colour = 'grey',
    fill = 'grey',
    alpha = 0.21
  ) +
  geom_point(size = 3) +
  scale_x_log10() +
  geom_text_repel(
    aes(label = name),
    seed = 666,
    min.segment.length = 0,
    box.padding = 0.5,
    color = 'black',
    size = 3.3,
    force = 10,
    alpha = 0.6,
    max.time = 1,
    segment.curvature   = 0.2,
    segment.alpha   = 0.6,
    fontface = 'italic'
  ) +
  labs(
    title = 'Mean frequency vs body mass\n',
    x = '\nLog body mass (g)',
    y = 'Frequency (Hz)\n'
  ) +
  theme(aspect.ratio = 0.6, plot.title = element_text(hjust = 0.5)) +
  bioac_theme()
Relationship bethween mean frequency and body mass in a sample of 15 UK birds

Figure 4.2: Relationship bethween mean frequency and body mass in a sample of 15 UK birds

Discuss!

The allometric relationship between the frequency of bird vocalizations and the size of their bodies arises because of physical and energetic constraints: animals can not easily produce sound waves larger than the size of their body or their sound-producing apparatus.

Departures from a negative allometric relationship between frequency of acoustic signals and body size may then reflect (1) differences in evolutionary history that caused variation in syrinx or vocal tract morphology (phylogenetic constraints) and (2) differences in costs or benefits of producing low‐frequency sounds. Thus, variation in frequency may inform about current or past selection on acoustic signals — Mikula et al. 2020

Distribution of peak song frequency across a passerine phylogeny. Source: Mikula et al. 2020

Further reading:

Mikula, P., Valcu, M., Brumm, H., Bulla, M., Forstmeier, W., Petrusková, T., Kempenaers, B. and Albrecht, T. (2021), A global analysis of song frequency in passerines provides no support for the acoustic adaptation hypothesis but suggests a role for sexual selection. Ecology Letters, 24: 477-486. https://doi.org/10.1111/ele.13662

Ryan, M. J., & Brenowitz, E. A. (1985). The Role of Body Size, Phylogeny, and Ambient Noise in the Evolution of Bird Song. The American Naturalist, 126(1), 87–100. https://doi.org/10.1086/284398

5 I need a title for the thrid part

6 re F0 etc

This is a problem that can be to some extent solved using more complex, automated extraction methods, but it is a very difficult issue. For now, you can add contrast to the spectrogram and select what you think might be the ‘frequency’? with highest amplitude (the loudest)